145 research outputs found

    Proteomic Detection of Non-Annotated Protein-Coding Genes in Pseudomonas fluorescens Pf0-1

    Genome sequences are annotated by computational prediction of coding sequences, followed by similarity searches such as BLAST, which provide a layer of possible functional information. While the existence of processes such as alternative splicing complicates matters for eukaryote genomes, the view of bacterial genomes as a linear series of closely spaced genes leads to the assumption that computational annotations that predict such arrangements completely describe the coding capacity of bacterial genomes. We undertook a proteomic study to identify proteins expressed by Pseudomonas fluorescens Pf0-1 from genes that were not predicted during the genome annotation. Mapping peptides to the Pf0-1 genome sequence identified sixteen non-annotated protein-coding regions, of which nine were antisense to predicted genes, six were intergenic, and one read in the same direction as an annotated gene but in a different frame. The expression of all but one of the newly discovered genes was verified by RT-PCR. Few clues as to the function of the new genes were gleaned from informatic analyses, but potential orthologs in other Pseudomonas genomes were identified for eight of the new genes. The 16 newly identified genes improve the quality of the Pf0-1 genome annotation, and the detection of antisense protein-coding genes indicates the under-appreciated complexity of bacterial genome organization.

    In Silico Metabolic Model and Protein Expression of Haemophilus influenzae Strain Rd KW20 in Rich Medium

    The intermediary metabolism of Haemophilus influenzae strain Rd KW20 was studied by a combination of protein expression analysis using a recently developed direct proteomics approach, mutational analysis, and mathematical modeling. Special emphasis was placed on carbon utilization, sugar fermentation, TCA cycle, and electron transport of H. influenzae cells grown microaerobically and anaerobically in a rich medium. The data indicate that several H. influenzae metabolic proteins similar to Escherichia coli proteins, known to be regulated by low concentrations of oxygen, were well expressed in both growth conditions in H. influenzae. An in silico model of the H. influenzae metabolic network was used to study the effects of selective deletion of certain enzymatic steps. This allowed us to define proteins predicted to be essential or non-essential for cell growth and to address numerous unresolved questions about intermediary metabolism of H. influenzae. Comparison of data from in vivo protein expression with the protein list associated with a genome-scale metabolic model showed significant coverage of the known metabolic proteome. This study demonstrates the significance of an integrated approach to the characterization of H. influenzae metabolism. Peer Reviewed. http://deepblue.lib.umich.edu/bitstream/2027.42/63406/1/153623104773547471.pd

    Widespread polycistronic gene expression in green algae

    Polycistronic gene expression, common in prokaryotes, was thought to be extremely rare in eukaryotes. The development of long-read sequencing of full-length transcript isomers (Iso-Seq) has facilitated a reexamination of that dogma. Using Iso-Seq, we discovered hundreds of examples of polycistronic expression of nuclear genes in two divergent species of green algae: Chlamydomonas reinhardtii and Chromochloris zofingiensis. Here, we employ a range of independent approaches to validate that multiple proteins are translated from a common transcript for hundreds of loci. A chromatin immunoprecipitation analysis using trimethylation of lysine 4 on histone H3 marks confirmed that transcription begins exclusively at the upstream gene. Quantification of polyadenylated [poly(A)] tails and poly(A) signal sequences confirmed that transcription ends exclusively after the downstream gene. Coexpression analysis found nearly perfect correlation for open reading frames (ORFs) within polycistronic loci, consistent with expression in a shared transcript. For many polycistronic loci, terminal peptides from both ORFs were identified from proteomics datasets, consistent with independent translation. Synthetic polycistronic gene pairs were transcribed and translated in vitro to recapitulate the production of two distinct proteins from a common transcript. The relative abundance of these two proteins can be modified by altering the Kozak-like sequence of the upstream gene. Replacement of the ORFs with selectable markers or reporters allows production of such heterologous proteins, speaking to utility in synthetic biology approaches. Conservation of a significant number of polycistronic gene pairs between C. reinhardtii, C. zofingiensis, and five other species suggests that this mechanism may be evolutionarily ancient and biologically important in the green algal lineage.

    A Dynamic Noise Level Algorithm for Spectral Screening of Peptide MS/MS Spectra

    Background: High-throughput shotgun proteomics data contain a significant number of spectra from non-peptide ions or spectra of too poor quality to obtain highly confident peptide identifications. These spectra cannot be identified with any positive peptide matches in some database search programs or are identified with false positives in others. Removing these spectra can improve the database search results and lower computational expense. Results: A new algorithm has been developed to filter tandem mass spectra of poor quality from shotgun proteomic experiments. The algorithm determines the noise level dynamically and independently for each spectrum in a tandem mass spectrometric data set. Spectra are filtered based on a minimum number of required signal peaks with a signal-to-noise ratio of 2. The algorithm was tested with 23 sample data sets containing 62,117 total spectra. Conclusions: The spectral screening removed 89.0% of the tandem mass spectra that did not yield a peptide match when searched with the MassMatrix database search software. Only 6.0% of tandem mass spectra that yielded peptide matches considered to be true positive matches were lost after spectral screening. The algorithm was found to be very effective at removal of unidentified spectra in other database search programs including Mascot, OMSSA, and X!Tandem (75.93%-91.00%) with a small loss (3.59%-9.40%) of true positive matches.
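
    The screening rule in the abstract above (a per-spectrum noise level, then a minimum count of peaks with a signal-to-noise ratio of at least 2) can be illustrated with a minimal sketch. The abstract does not say how the noise level is estimated or how many signal peaks are required, so the median-intensity noise estimate, the default of five peaks, and all function names below are assumptions for illustration only, not the published algorithm.

```python
from statistics import median


def noise_level(intensities):
    """Per-spectrum noise estimate.

    ASSUMPTION: the median peak intensity stands in for the noise level;
    the abstract only says the level is determined dynamically per spectrum.
    """
    return median(intensities)


def passes_screen(intensities, min_signal_peaks=5, snr_threshold=2.0):
    """Return True if the spectrum has enough peaks above the S/N threshold.

    `min_signal_peaks` is a hypothetical default; the S/N threshold of 2
    is the value quoted in the abstract. Intensities are assumed positive.
    """
    if not intensities:
        return False
    noise = noise_level(intensities)
    signal_peaks = sum(1 for i in intensities if i >= snr_threshold * noise)
    return signal_peaks >= min_signal_peaks


def screen_spectra(spectra, min_signal_peaks=5, snr_threshold=2.0):
    """Keep only the spectra (lists of peak intensities) that pass the screen."""
    return [
        s for s in spectra
        if passes_screen(s, min_signal_peaks, snr_threshold)
    ]
```

    Because the noise level is recomputed for every spectrum, a uniformly weak spectrum is rejected even if its absolute intensities are similar to the background of a good one.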

    Sulfide Generation by Dominant Halanaerobium Microorganisms in Hydraulically Fractured Shales

    Hydraulic fracturing of black shale formations has greatly increased United States oil and natural gas recovery. However, the accumulation of biomass in subsurface reservoirs and pipelines is detrimental because of possible well souring, microbially induced corrosion, and pore clogging. Temporal sampling of produced fluids from a well in the Utica Shale revealed the dominance of Halanaerobium strains within the in situ microbial community and the potential for these microorganisms to catalyze thiosulfate-dependent sulfidogenesis. From these field data, we investigated biogenic sulfide production catalyzed by a Halanaerobium strain isolated from the produced fluids using proteogenomics and laboratory growth experiments. Analysis of Halanaerobium isolate genomes and reconstructed genomes from metagenomic data sets revealed the conserved presence of rhodanese-like proteins and anaerobic sulfite reductase complexes capable of converting thiosulfate to sulfide. Shotgun proteomics measurements using a Halanaerobium isolate verified that these proteins were more abundant when thiosulfate was present in the growth medium, and culture-based assays identified thiosulfate-dependent sulfide production by the same isolate. Increased production of sulfide and organic acids during the stationary growth phase suggests that fermentative Halanaerobium uses thiosulfate to remove excess reductant. These findings emphasize the potential detrimental effects that could arise from thiosulfate-reducing microorganisms in hydraulically fractured shales, which are undetected by current industry-wide corrosion diagnostics. IMPORTANCE: Although thousands of wells in deep shale formations across the United States have been hydraulically fractured for oil and gas recovery, the impact of microbial metabolism within these environments is poorly understood. Our research demonstrates that dominant microbial populations in these subsurface ecosystems contain the conserved capacity for the reduction of thiosulfate to sulfide and that this process is likely occurring in the environment. Sulfide generation (also known as “souring”) is considered deleterious in the oil and gas industry because of both toxicity issues and impacts on corrosion of the subsurface infrastructure. Critically, the capacity for sulfide generation via reduction of sulfate was not detected in our data sets. Given that current industry wellhead tests for sulfidogenesis target canonical sulfate-reducing microorganisms, these data suggest that new approaches to the detection of sulfide-producing microorganisms may be necessary.

    Hypergraph models of biological networks to identify genes critical to pathogenic viral response

    Background: Representing biological networks as graphs is a powerful approach to reveal underlying patterns, signatures, and critical components from high-throughput biomolecular data. However, graphs do not natively capture the multi-way relationships present among genes and proteins in biological systems. Hypergraphs are generalizations of graphs that naturally model multi-way relationships and have shown promise in modeling systems such as protein complexes and metabolic reactions. In this paper we seek to understand how hypergraphs can more faithfully identify, and potentially predict, important genes based on complex relationships inferred from genomic expression data sets. Results: We compiled a novel data set of transcriptional host response to pathogenic viral infections and formulated relationships between genes as a hypergraph where hyperedges represent significantly perturbed genes, and vertices represent individual biological samples with specific experimental conditions. We find that hypergraph betweenness centrality is a superior method for identification of genes important to viral response when compared with graph centrality. Conclusions: Our results demonstrate the utility of using hypergraphs to represent complex biological systems and highlight central important responses in common to a variety of highly pathogenic viruses.

    ETISEQ – an algorithm for automated elution time ion sequencing of concurrently fragmented peptides for mass spectrometry-based proteomics

    Background: Concurrent peptide fragmentation (i.e., shotgun CID, parallel CID, or MS^E) has emerged as an alternative to data-dependent acquisition for generating peptide fragmentation data in LC-MS/MS proteomics experiments. Concurrent peptide fragmentation data acquisition has been shown to be advantageous over data-dependent acquisition by providing greater detection dynamic range and more accurate quantitative information. Nevertheless, it has yet to be widely adopted because of the lack of published algorithms designed specifically to process or interpret such data acquired on any mass spectrometer. Results: An algorithm called Elution Time Ion Sequencing (ETISEQ) has been developed to enable automated conversion of concurrent peptide fragmentation data to LC-MS/MS data. ETISEQ generates MS/MS-like spectra based on the correlation of precursor and product ion elution profiles. The performance of ETISEQ is demonstrated using concurrent peptide fragmentation data from tryptic digests of standard proteins and whole influenza virus. It is shown that the number of unique peptides identified from the digests is broadly comparable between ETISEQ-processed concurrent peptide fragmentation data and the data-dependent acquired LC-MS/MS data. Conclusion: The ETISEQ algorithm has been designed for easy integration with existing MS/MS analysis platforms. It is anticipated that it will popularize concurrent peptide fragmentation data acquisition in proteomics laboratories.
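
    The idea of building MS/MS-like spectra from correlated elution profiles, as described in the abstract above, might be sketched as follows. This is not the ETISEQ implementation: the use of Pearson correlation, the 0.9 cutoff, the rule of assigning each product ion to its single best-matching precursor, and all names are assumptions made for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length intensity profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no elution information
    return cov / sqrt(var_x * var_y)


def group_products(precursors, products, min_r=0.9):
    """Assign each product ion to the precursor whose elution profile it tracks.

    `precursors` and `products` map m/z -> intensity over the same retention-time
    points. ASSUMPTION: a product ion joins the best-correlated precursor only if
    the correlation reaches `min_r` (a hypothetical cutoff).
    """
    spectra = {mz: [] for mz in precursors}
    for product_mz, product_profile in products.items():
        best_mz, best_r = None, min_r
        for precursor_mz, precursor_profile in precursors.items():
            r = pearson(precursor_profile, product_profile)
            if r >= best_r:
                best_mz, best_r = precursor_mz, r
        if best_mz is not None:
            spectra[best_mz].append(product_mz)
    return spectra
```

    Each resulting list of product m/z values plays the role of one MS/MS-like spectrum that could then be handed to a conventional database search.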

    Dichomitus squalens partially tailors its molecular responses to the composition of solid wood

    White-rot fungi, such as Dichomitus squalens, degrade all wood components and inhabit mixed-wood forests containing both soft- and hardwood species. In this study, we evaluated how D. squalens responded to the compositional differences in softwood [guaiacyl (G) lignin and higher mannan content] and hardwood [syringyl/guaiacyl (S/G) lignin and higher xylan content] using semi-natural solid cultures. Spruce (softwood) and birch (hardwood) sticks were degraded by D. squalens as measured by oxidation of the lignins using 2D-NMR. The fungal response as measured by transcriptomics, proteomics and enzyme activities showed a partial tailoring to wood composition. Mannanolytic transcripts and proteins were more abundant in spruce cultures, while a proportionally higher xylanolytic activity was detected in birch cultures. Both wood types induced manganese peroxidases to a much higher level than laccases, but higher transcript and protein levels of the manganese peroxidases were observed on the G-lignin rich spruce. Overall, the molecular responses demonstrated a stronger adaptation to the spruce rather than birch composition, possibly because D. squalens is mainly found degrading softwoods in nature, which supports the ability of the solid wood cultures to reflect the natural environment. Peer reviewed.